Dynamic Scheduling of Web API Calls

ABSTRACT

In some embodiments, a method may involve a computing system sending, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot. The computing system may receive, from the client device, a first API call during the first time slot, and may process the first API call. In some embodiments, a method may involve a computing system sending, to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot. The computing system may receive, from the client device, a first API call during a second time slot that is different than the first time slot, and may process the first API call.

BACKGROUND

Various systems have been developed that allow client devices to access applications and/or data files over a network. Certain products offered by Citrix Systems, Inc., of Fort Lauderdale, Fla., including the Citrix Workspace™ and Citrix ShareFile® families of products, provide such capabilities. Some such systems employ applications or services that can be accessed over the internet via Web application programming interface (Web API) calls from client devices or systems.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features, nor is it intended to limit the scope of the claims included herewith.

In some of the disclosed embodiments, a method involves sending, from a computing system to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot, the first token including a first signature based on the first data and a private key of the computing system. The computing system receives a first API call during the first time slot, and determines that the first API call includes a second token, the second token including second data and a second signature. The computing system uses the private key and the second data to determine that the second signature is valid, and determines that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot. The computing system processes the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.

In some disclosed embodiments, a method involves sending, from a computing system to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during the first time slot; and processing, by the computing system, the first API call.

In some disclosed embodiments, a method involves sending, from a computing system to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during a second time slot that is different than the first time slot; and processing, by the computing system, the first API call.

In some disclosed embodiments, a system comprises at least one processor, and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that (A) the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, or (B) the client device is prohibited from sending an API call to the system during at least a second time slot that is different than the first time slot, to receive, from the client device, a first API call during the first time slot, and to process, by the system, the first API call.

BRIEF DESCRIPTION OF THE DRAWINGS

Objects, aspects, features, and advantages of embodiments disclosed herein will become more fully apparent from the following detailed description, the appended claims, and the accompanying figures in which like reference numerals identify similar or identical elements. Reference numerals that are introduced in the specification in association with a figure may be repeated in one or more subsequent figures without additional description in the specification in order to provide context for other features, and not every element may be labeled in every figure. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments, principles and concepts. The drawings are not intended to limit the scope of the claims included herewith.

FIG. 1 is a high-level illustration of an intelligent Web API call management system configured in accordance with some aspects of the present disclosure;

FIG. 2 is a diagram of a network environment in which some embodiments of the intelligent Web API call management system disclosed herein may be deployed;

FIG. 3 is a block diagram of a computing system that may be used to implement one or more of the components of the computing environment shown in FIG. 2 in accordance with some embodiments;

FIG. 4 is a schematic block diagram of a cloud computing environment in which various aspects of the disclosure may be implemented;

FIG. 5 is a block diagram illustrating example components that may be used to implement some embodiments of the intelligent Web API call management system shown in FIG. 1;

FIG. 6 illustrates features of an example policy token that may be employed to communicate call policy information to clients in accordance with some embodiments of the present disclosure;

FIG. 7 is a flowchart illustrating an example routine that may be performed by the gateway shown in FIG. 5 in accordance with some embodiments of the present disclosure; and

FIG. 8 shows an example table that may be maintained and used by the gateway shown in FIG. 5 in connection with the routine shown in FIG. 7.

DETAILED DESCRIPTION

For purposes of reading the description of the various embodiments below, the following descriptions of the sections of the specification and their respective contents may be helpful:

Section A provides an introduction to example embodiments of an intelligent Web API call management system configured in accordance with some aspects of the present disclosure;

Section B describes a network environment which may be useful for practicing embodiments described herein;

Section C describes a computing system which may be useful for practicing embodiments described herein;

Section D describes embodiments of systems and methods for accessing computing resources using a cloud computing environment;

Section E provides a more detailed description of example embodiments of the intelligent Web API call management system introduced in Section A; and

Section F describes example implementations of methods, systems/devices, and computer-readable media in accordance with the present disclosure.

A. Introduction to Illustrative Embodiments of an Intelligent Web API Call Management System

Oftentimes services expose capabilities to clients as Web APIs. Clients may need to call such Web APIs periodically (hourly, daily, weekly, etc.) to achieve some business intent, e.g., to get a lease for using a business capability or to renew an existing lease. Other examples of such an API interaction pattern could be periodically reporting the health/presence of a client device (beacon), polling for some information from a server, refreshing an access token, etc. With this kind of uncoordinated or otherwise random pattern of interaction, it is a challenge to service API traffic during a particular time window. Spike(s) or overload of traffic at any one time may result in either outages or over-provisioning on the server side during some time periods and under-utilization during other time periods. This problem can manifest in any circumstance in which a group of N clients makes Web API calls to a service periodically.

One approach for addressing this issue is to implement rate-limiting of API calls on the server side. Such a measure can help prevent “bad” clients from bombarding the service with requests by limiting the number of calls they can make within a given time period. However, because clients independently determine the times at which individual API calls are sent, this approach cannot prevent a large group of “good” clients from making legitimate calls (i.e., within their rate-limit) at the same time and thus overloading the server.

As used herein, the term “good clients” refers to the clients that access the service in good faith by following the service specification, purpose, and guidance. The term “bad clients,” on the other hand, refers to the clients that abuse and put additional stress on the service by not following the service specification, purpose and/or guidance. In some circumstances, bad clients may even deliberately perpetrate a denial of service (DoS) cyber-attack by bombarding a service with API calls in an effort to crash the service and/or prevent good clients from accessing it. It should be appreciated, however, that bad clients are not necessarily unauthorized clients trying to get access to the service. They may also be authenticated and authorized clients making unnecessary requests too often, out-of-order, etc.

On the server side, it may be possible to do optimizations like pre-processing/caching etc., so that each request can be processed faster and thereby increase throughput. A service may also bring additional resources on line to accommodate periods of increased client requests. Such approaches do not, however, prevent multiple clients from issuing legitimate requests to the service at the same time, so the above problem of overloading the server still remains.

Existing server-side solutions tend to be reactive in nature because the servers either drop calls when their capacity (e.g., a threshold) is reached or elastically scale to be able to meet the increased need at a future time but do not allow the processing of current calls. None of them take steps to proactively prevent the problem.

On the client side, a possible approach is for individual clients to randomize the times at which they send requests. Given that individual clients make the decision when to send their respective requests in isolation, however, such an approach may or may not help because it is still possible for a large number of clients to issue Web API calls at the same time. In short, present client side solutions do not solve the above-described problem because the receipt of such calls is still random and uncoordinated, which can cause difficulty in their processing.
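The limitation of purely client-side randomization can be illustrated with a small simulation. The following is a minimal sketch and is not part of the disclosed system; the function names, the client count, and the number of slots are hypothetical values chosen only for illustration. It compares the per-slot load produced when each client independently picks a random slot against the even spread that only a coordinating party could guarantee.

```python
import random
from collections import Counter


def random_slot_loads(num_clients: int, num_slots: int, seed: int = 0) -> list[int]:
    """Each client independently picks a slot; return the calls per slot."""
    rng = random.Random(seed)
    picks = Counter(rng.randrange(num_slots) for _ in range(num_clients))
    return [picks.get(slot, 0) for slot in range(num_slots)]


def even_slot_loads(num_clients: int, num_slots: int) -> list[int]:
    """Ideal spread, achievable only if some party coordinates the clients."""
    base, extra = divmod(num_clients, num_slots)
    return [base + (1 if slot < extra else 0) for slot in range(num_slots)]


if __name__ == "__main__":
    uncoordinated = random_slot_loads(10_000, 24)
    coordinated = even_slot_loads(10_000, 24)
    print("busiest uncoordinated slot:", max(uncoordinated))
    print("busiest coordinated slot:  ", max(coordinated))
```

The busiest slot under independent random selection can never be lighter than the busiest slot of the even spread, and the service has to be provisioned for whichever peak actually occurs.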

Offered is a system in which a web service can exercise at least a minimal degree of control over the timing of Web API calls that it receives from respective clients, thus enabling the service to coordinate or otherwise orchestrate such calls to be spread over time and minimize the likelihood that the service will receive an unduly large number of API calls at the same time. FIG. 1 shows a high-level implementation of an intelligent Web API call management system 100 in which a web service 102 (e.g., implemented by one or more servers 204, examples of which are described below) may exercise control over the timing of Web API calls 104 that are sent by one or more clients 202, examples of which are described below, in such a manner.

As shown, in some implementations, the web service 102 may provide call policy information 106 to respective client devices 202 that specifies one or more policies concerning times at which and/or circumstances under which the client devices 202 are authorized to send the Web API calls 104 to the web service 102. As explained in more detail below, the call policy information 106 sent to respective clients may be dynamically determined by the web service 102 so as to ensure that the client devices 202 collectively distribute their Web API calls 104 over time, thus minimizing the risk that the web service 102 will be inundated by an unduly large number of Web API calls 104 at the same time. In some implementations, for example, the web service 102 may set the call policy information 106 so that respective clients 202 are informed of particular allowed time slots, periods, intervals, phases, openings and the like (alternatively referred to herein as “allowed slots”) during which they are authorized to send Web API calls 104 to the web service 102 and/or of particular blocked time slots or the like (alternatively referred to herein as “blocked slots”) during which they are prohibited from sending Web API calls 104 to the web service 102.

By keeping track of such slot assignments as they are made, the web service 102 may maintain a forward looking schedule indicating the approximate numbers of Web API calls that are expected to occur, assuming the clients follow the call policy information 106 sent to them, during respective time slots within some future period of time (e.g., for one hour slots during the following twenty-four hours), and can set the call policy information 106 that is subsequently sent to clients 202 to ensure that the total numbers of expected calls for the different time slots on the schedule are approximately the same, or are otherwise appropriately distributed amongst the future time slots in view of the expected capacity of the system during such time slots. In some implementations, the web service 102 may further track historical data concerning the numbers of Web API calls 104 that are actually processed during respective historical time slots, and may further use that historical data to determine how to best scale the system to meet expected demands during corresponding time slots in the future, and/or to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106, as discussed above.
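One way such a forward-looking schedule might be kept is sketched below. This is an illustrative assumption rather than the disclosed implementation: the class name `SlotSchedule`, the integer slot indices, and the least-utilized-slot rule are all hypothetical. Each slot has an expected capacity, and each new assignment goes to the slot whose expected call count is lowest relative to that capacity.

```python
from dataclasses import dataclass, field


@dataclass
class SlotSchedule:
    """Forward-looking count of expected calls per future time slot."""

    capacity: dict[int, int]  # slot index -> calls the service expects to handle
    expected: dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for slot, cap in self.capacity.items():
            if cap <= 0:
                raise ValueError(f"slot {slot} must have positive capacity")
            self.expected.setdefault(slot, 0)

    def utilization(self, slot: int) -> float:
        """Expected calls in the slot as a fraction of its capacity."""
        return self.expected[slot] / self.capacity[slot]

    def assign(self) -> int:
        """Pick the least-utilized slot and record one more expected call."""
        slot = min(self.capacity, key=lambda s: (self.utilization(s), s))
        self.expected[slot] += 1
        return slot


if __name__ == "__main__":
    schedule = SlotSchedule(capacity={0: 100, 1: 100, 2: 200})
    for _ in range(8):
        schedule.assign()
    print(schedule.expected)
```

With the capacities used in the example, the slot with twice the capacity ends up with twice as many expected calls, which corresponds to distributing calls in view of the expected capacity of the system during each slot.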

In some implementations, the call policy information 106 may be included as a part of the API responses (e.g., as an attribute in the header or body of the response) the web service 102 sends to the clients 202 in response to respective Web API calls 104. As explained in more detail below, for example, in some implementations the call policy information 106 may be included within a token (e.g., a JSON Web Token (JWT)) or other credential, data fragment, etc., that may be appended to such API responses. In other implementations, the web service 102 may additionally or alternatively expose a dedicated endpoint from which the clients 202 may retrieve the call policy information 106. Allowing the client devices 202 to retrieve the call policy information 106 on demand from a dedicated endpoint may be more suitable for relatively static situations, where the call policy information 106 does not change significantly over time. Including the call policy information 106 within respective API responses, on the other hand, may make it easier for the web service 102 to dynamically change the call policy information 106 after receipt of individual Web API calls 104 and adapt the call policy information 106 to current load, bandwidth, resource availability, etc.

FIG. 1 further shows an example routine 108 that may be performed by one or more components of the web service 102 in accordance with some embodiments of the present disclosure. As shown, at a step 110 of the routine 108, a computing system (e.g., the web service 102) may send first data (e.g., the call policy information 106) to a client device 202. The first data (e.g., the call policy information 106) may indicate that (A) the client device 202 is authorized to send an API call (e.g., a Web API call 104) to the computing system (e.g., the web service 102) during at least a first time slot (e.g., an allowed slot) and/or (B) the client device 202 is prohibited from sending an API call to the computing system (e.g., the web service 102) during at least a second time slot (e.g., a blocked slot) that is different than the first time slot.

At a step 112 of the routine 108, the computing system (e.g., the web service 102) may receive, from the client device 202, a first API call (e.g., a Web API call 104) during the first time slot (e.g., an allowed slot and/or a non-blocked slot).

At a step 114 of the routine 108, the computing system (e.g., the web service 102) may process the first API call (e.g., a Web API call 104).
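The routine 108 might be approximated in code as shown below. This is a minimal sketch under stated assumptions: slots are modeled as hour-of-day indices, the names `CallPolicy`, `slot_of`, and `handle_api_call` are hypothetical, and rejecting a call that arrives outside a permitted slot is an assumed behavior that the description of the routine 108 does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallPolicy:
    """Call policy for one client; slots are hour-of-day indices (0-23)."""

    allowed_slots: frozenset[int] = frozenset()
    blocked_slots: frozenset[int] = frozenset()

    def permits(self, slot: int) -> bool:
        if slot in self.blocked_slots:
            return False
        # An empty allow-list is treated here as "any non-blocked slot".
        return not self.allowed_slots or slot in self.allowed_slots


def slot_of(epoch_seconds: int, slot_seconds: int = 3600, slots_per_day: int = 24) -> int:
    """Map a UTC timestamp (in seconds) to its slot index within the day."""
    return (epoch_seconds // slot_seconds) % slots_per_day


def handle_api_call(policy: CallPolicy, received_at: int) -> str:
    """Steps 112 and 114: process a call that arrived in a permitted slot."""
    if policy.permits(slot_of(received_at)):
        return "processed"
    return "rejected"  # Assumed behavior for calls outside a permitted slot.


if __name__ == "__main__":
    # Step 110: this policy would have been sent to the client beforehand.
    policy = CallPolicy(allowed_slots=frozenset({2}), blocked_slots=frozenset({5}))
    print(handle_api_call(policy, 2 * 3600 + 10))
    print(handle_api_call(policy, 5 * 3600))
```

The same `CallPolicy` type covers both variants described above: a policy that lists only allowed slots, and a policy that lists only blocked slots and implicitly permits every other slot.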

Additional details and example implementations of embodiments of the present disclosure are set forth below in Section E, following a description of example systems and network environments in which such embodiments may be deployed.

B. Network Environment

Referring to FIG. 2, an illustrative network environment 200 is depicted. As shown, the network environment 200 may include one or more clients 202(1)-202(n) (also generally referred to as local machine(s) 202 or client(s) 202) in communication with one or more servers 204(1)-204(n) (also generally referred to as remote machine(s) 204 or server(s) 204) via one or more networks 206(1)-206(n) (generally referred to as network(s) 206). In some embodiments, a client 202 may communicate with a server 204 via one or more appliances 208(1)-208(n) (generally referred to as appliance(s) 208 or gateway(s) 208). In some embodiments, a client 202 may have the capacity to function as both a client node seeking access to resources provided by a server 204 and as a server 204 providing access to hosted resources for other clients 202.

Although the embodiment shown in FIG. 2 shows one or more networks 206 between the clients 202 and the servers 204, in other embodiments, the clients 202 and the servers 204 may be on the same network 206. When multiple networks 206 are employed, the various networks 206 may be the same type of network or different types of networks. For example, in some embodiments, the networks 206(1) and 206(n) may be private networks such as local area networks (LANs) or company Intranets, while the network 206(2) may be a public network, such as a metropolitan area network (MAN), wide area network (WAN), or the Internet. In other embodiments, one or both of the network 206(1) and the network 206(n), as well as the network 206(2), may be public networks. In yet other embodiments, all three of the network 206(1), the network 206(2) and the network 206(n) may be private networks. The networks 206 may employ one or more types of physical networks and/or network topologies, such as wired and/or wireless networks, and may employ one or more communication transport protocols, such as transmission control protocol (TCP), internet protocol (IP), user datagram protocol (UDP) or other similar protocols. In some embodiments, the network(s) 206 may include one or more mobile telephone networks that use various protocols to communicate among mobile devices. In some embodiments, the network(s) 206 may include one or more wireless local-area networks (WLANs). For short range communications within a WLAN, clients 202 may communicate using 802.11, Bluetooth, and/or Near Field Communication (NFC).

As shown in FIG. 2, one or more appliances 208 may be located at various points or in various communication paths of the network environment 200. For example, the appliance 208(1) may be deployed between the network 206(1) and the network 206(2), and the appliance 208(n) may be deployed between the network 206(2) and the network 206(n). In some embodiments, the appliances 208 may communicate with one another and work in conjunction to, for example, accelerate network traffic between the clients 202 and the servers 204. In some embodiments, appliances 208 may act as a gateway between two or more networks. In other embodiments, one or more of the appliances 208 may instead be implemented in conjunction with or as part of a single one of the clients 202 or servers 204 to allow such device to connect directly to one of the networks 206. In some embodiments, one or more appliances 208 may operate as an application delivery controller (ADC) to provide one or more of the clients 202 with access to business applications and other data deployed in a datacenter, the cloud, or delivered as Software as a Service (SaaS) across a range of client devices, and/or provide other functionality such as load balancing, etc. In some embodiments, one or more of the appliances 208 may be implemented as network devices sold by Citrix Systems, Inc., of Fort Lauderdale, Fla., such as Citrix Gateway™ or Citrix ADC™.

A server 204 may be any server type such as, for example: a file server; an application server; a web server; a proxy server; an appliance; a network appliance; a gateway; an application gateway; a gateway server; a virtualization server; a deployment server; a Secure Sockets Layer Virtual Private Network (SSL VPN) server; a firewall; a server executing an active directory; a cloud server; or a server executing an application acceleration program that provides firewall functionality, application functionality, or load balancing functionality.

A server 204 may execute, operate or otherwise provide an application that may be any one of the following: software; a program; executable instructions; a virtual machine; a hypervisor; a web browser; a web-based client; a client-server application; a thin-client computing client; an ActiveX control; a Java applet; software related to voice over internet protocol (VoIP) communications like a soft IP telephone; an application for streaming video and/or audio; an application for facilitating real-time-data communications; an HTTP client; an FTP client; an Oscar client; a Telnet client; or any other set of executable instructions.

In some embodiments, a server 204 may execute a remote presentation services program or other program that uses a thin-client or a remote-display protocol to capture display output generated by an application executing on a server 204 and transmit the application display output to a client device 202.

In yet other embodiments, a server 204 may execute a virtual machine providing, to a user of a client 202, access to a computing environment. The client 202 may be a virtual machine. The virtual machine may be managed by, for example, a hypervisor, a virtual machine manager (VMM), or any other hardware virtualization technique within the server 204.

As shown in FIG. 2, in some embodiments, groups of the servers 204 may operate as one or more server farms 210. The servers 204 of such server farms 210 may be logically grouped, and may either be geographically co-located (e.g., on premises) or geographically dispersed (e.g., cloud based) from the clients 202 and/or other servers 204. In some embodiments, two or more server farms 210 may communicate with one another, e.g., via respective appliances 208 connected to the network 206(2), to allow multiple server-based processes to interact with one another.

As also shown in FIG. 2, in some embodiments, one or more of the appliances 208 may include, be replaced by, or be in communication with, one or more additional appliances, such as WAN optimization appliances 212(1)-212(n), referred to generally as WAN optimization appliance(s) 212. For example, WAN optimization appliances 212 may accelerate, cache, compress or otherwise optimize or improve performance, operation, flow control, or quality of service of network traffic, such as traffic to and/or from a WAN connection, such as optimizing Wide Area File Services (WAFS), accelerating Server Message Block (SMB) or Common Internet File System (CIFS). In some embodiments, one or more of the appliances 212 may be a performance enhancing proxy or a WAN optimization controller.

In some embodiments, one or more of the appliances 208, 212 may be implemented as products sold by Citrix Systems, Inc., of Fort Lauderdale, Fla., such as Citrix SD-WAN™ or Citrix Cloud™. For example, in some implementations, one or more of the appliances 208, 212 may be cloud connectors that enable communications to be exchanged between resources within a cloud computing environment and resources outside such an environment, e.g., resources hosted within a data center of an organization.

C. Computing Environment

FIG. 3 illustrates an example of a computing system 300 that may be used to implement one or more of the respective components (e.g., the clients 202, the servers 204, the appliances 208, 212) within the network environment 200 shown in FIG. 2. As shown in FIG. 3, the computing system 300 may include one or more processors 302, volatile memory 304 (e.g., RAM), non-volatile memory 306 (e.g., one or more hard disk drives (HDDs) or other magnetic or optical storage media, one or more solid state drives (SSDs) such as a flash drive or other solid state storage media, one or more hybrid magnetic and solid state drives, and/or one or more virtual storage volumes, such as a cloud storage, or a combination of such physical storage volumes and virtual storage volumes or arrays thereof), a user interface (UI) 308, one or more communications interfaces 310, and a communication bus 312. The user interface 308 may include a graphical user interface (GUI) 314 (e.g., a touchscreen, a display, etc.) and one or more input/output (I/O) devices 316 (e.g., a mouse, a keyboard, etc.). The non-volatile memory 306 may store an operating system 318, one or more applications 320, and data 322 such that, for example, computer instructions of the operating system 318 and/or applications 320 are executed by the processor(s) 302 out of the volatile memory 304. Data may be entered using an input device of the GUI 314 or received from I/O device(s) 316. Various elements of the computing system 300 may communicate via the communication bus 312. The computing system 300 as shown in FIG. 3 is shown merely as an example, as the clients 202, servers 204 and/or appliances 208 and 212 may be implemented by any computing or processing environment and with any type of machine or set of machines that may have suitable hardware and/or software capable of operating as described herein.

The processor(s) 302 may be implemented by one or more programmable processors executing one or more computer programs to perform the functions of the system. As used herein, the term “processor” describes an electronic circuit that performs a function, an operation, or a sequence of operations. The function, operation, or sequence of operations may be hard coded into the electronic circuit or soft coded by way of instructions held in a memory device. A “processor” may perform the function, operation, or sequence of operations using digital values or using analog signals. In some embodiments, the “processor” can be embodied in one or more application specific integrated circuits (ASICs), microprocessors, digital signal processors, microcontrollers, field programmable gate arrays (FPGAs), programmable logic arrays (PLAs), multi-core processors, or general-purpose computers with associated memory. The “processor” may be analog, digital or mixed-signal. In some embodiments, the “processor” may be one or more physical processors or one or more “virtual” (e.g., remotely located or “cloud”) processors.

The communications interfaces 310 may include one or more interfaces to enable the computing system 300 to access a computer network such as a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or the Internet through a variety of wired and/or wireless connections, including cellular connections.

As noted above, in some embodiments, one or more computing systems 300 may execute an application on behalf of a user of a client computing device (e.g., a client 202 shown in FIG. 2), may execute a virtual machine, which provides an execution session within which applications execute on behalf of a user or a client computing device (e.g., a client 202 shown in FIG. 2), such as a hosted desktop session, may execute a terminal services session to provide a hosted desktop environment, or may provide access to a computing environment including one or more of: one or more applications, one or more desktop applications, and one or more desktop sessions in which one or more applications may execute.

D. Systems and Methods for Delivering Shared Resources Using a Cloud Computing Environment

Referring to FIG. 4, a cloud computing environment 400 is depicted, which may also be referred to as a cloud environment, cloud computing or cloud network. The cloud computing environment 400 can provide the delivery of shared computing services and/or resources to multiple users or tenants. For example, the shared resources and services can include, but are not limited to, networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, databases, software, hardware, analytics, and intelligence.

In the cloud computing environment 400, one or more clients 202 (such as those described in connection with FIG. 2) are in communication with a cloud network 404. The cloud network 404 may include back-end platforms, e.g., servers, storage, server farms and/or data centers. The clients 202 may correspond to a single organization/tenant or multiple organizations/tenants. More particularly, in one example implementation, the cloud computing environment 400 may provide a private cloud serving a single organization (e.g., enterprise cloud). In another example, the cloud computing environment 400 may provide a community or public cloud serving multiple organizations/tenants.

In some embodiments, a gateway appliance(s) or service may be utilized to provide access to cloud computing resources and virtual sessions. By way of example, Citrix Gateway, provided by Citrix Systems, Inc., may be deployed on-premises or on public clouds to provide users with secure access and single sign-on to virtual, SaaS and web applications. Furthermore, to protect users from web threats, a gateway such as Citrix Secure Web Gateway may be used. Citrix Secure Web Gateway uses a cloud-based service and a local cache to check for URL reputation and category.

In still further embodiments, the cloud computing environment 400 may provide a hybrid cloud that is a combination of a public cloud and one or more resources located outside such a cloud, such as resources hosted within one or more data centers of an organization. Public clouds may include public servers that are maintained by third parties to the clients 202 or the enterprise/tenant. The servers may be located off-site in remote geographical locations or otherwise. In some implementations, one or more cloud connectors may be used to facilitate the exchange of communications between one or more resources within the cloud computing environment 400 and one or more resources outside of such an environment.

The cloud computing environment 400 can provide resource pooling to serve multiple users via clients 202 through a multi-tenant environment or multi-tenant model with different physical and virtual resources dynamically assigned and reassigned responsive to different demands within the respective environment. The multi-tenant environment can include a system or architecture that can provide a single instance of software, an application or a software application to serve multiple users. In some embodiments, the cloud computing environment 400 can provide on-demand self-service to unilaterally provision computing capabilities (e.g., server time, network storage) across a network for multiple clients 202. By way of example, provisioning services may be provided through a system such as Citrix Provisioning Services (Citrix PVS). Citrix PVS is a software-streaming technology that delivers patches, updates, and other configuration information to multiple virtual desktop endpoints through a shared desktop image. The cloud computing environment 400 can provide an elasticity to dynamically scale out or scale in in response to different demands from one or more clients 202. In some embodiments, the cloud computing environment 400 may include or provide monitoring services to monitor, control and/or generate reports corresponding to the provided shared services and resources.

In some embodiments, the cloud computing environment 400 may provide cloud-based delivery of different types of cloud computing services, such as Software as a Service (SaaS) 402, Platform as a Service (PaaS) 404, Infrastructure as a Service (IaaS) 406, and Desktop as a Service (DaaS) 408, for example. IaaS may refer to a user renting the use of infrastructure resources that are needed during a specified time period. IaaS providers may offer storage, networking, servers or virtualization resources from large pools, allowing the users to quickly scale up by accessing more resources as needed. Examples of IaaS platforms include AMAZON WEB SERVICES provided by Amazon.com, Inc., of Seattle, Wash., Azure IaaS provided by Microsoft Corporation of Redmond, Wash., RACKSPACE CLOUD provided by Rackspace US, Inc., of San Antonio, Tex., Google Compute Engine provided by Google Inc. of Mountain View, Calif., and RIGHTSCALE provided by RightScale, Inc., of Santa Barbara, Calif.

PaaS providers may offer functionality provided by IaaS, including, e.g., storage, networking, servers or virtualization, as well as additional resources such as, e.g., the operating system, middleware, or runtime resources. Examples of PaaS include WINDOWS AZURE provided by Microsoft Corporation of Redmond, Wash., Google App Engine provided by Google Inc., and HEROKU provided by Heroku, Inc. of San Francisco, Calif.

SaaS providers may offer the resources that PaaS provides, including storage, networking, servers, virtualization, operating system, middleware, or runtime resources. In some embodiments, SaaS providers may offer additional resources including, e.g., data and application resources. Examples of SaaS include GOOGLE APPS provided by Google Inc., SALESFORCE provided by Salesforce.com Inc. of San Francisco, Calif., or OFFICE 365 provided by Microsoft Corporation. Examples of SaaS may also include data storage providers, e.g. Citrix ShareFile® from Citrix Systems, DROPBOX provided by Dropbox, Inc. of San Francisco, Calif., Microsoft SKYDRIVE provided by Microsoft Corporation, Google Drive provided by Google Inc., or Apple ICLOUD provided by Apple Inc. of Cupertino, Calif.

Similar to SaaS, DaaS (which is also known as hosted desktop services) is a form of virtual desktop infrastructure (VDI) in which virtual desktop sessions are typically delivered as a cloud service along with the apps used on the virtual desktop. Citrix Cloud from Citrix Systems is one example of a DaaS delivery platform. DaaS delivery platforms may be hosted on a public cloud computing infrastructure, such as AZURE CLOUD from Microsoft Corporation of Redmond, Wash., or AMAZON WEB SERVICES provided by Amazon.com, Inc., of Seattle, Wash., for example. In the case of Citrix Cloud, Citrix Workspace app may be used as a single-entry point for bringing apps, files and desktops together (whether on-premises or in the cloud) to deliver a unified experience.

E. Detailed Description of Example Embodiments of an Intelligent Web API Call Management System

FIG. 5 shows examples of components that may be included within some implementations of the intelligent Web API call management system 100 introduced above in connection with FIG. 1. In the illustrated example, the web service 102 includes a gateway 502 (e.g., an API gateway) and one or more microservices 504 a, 504 b. As shown, the gateway 502 may be located between the clients 202 and the microservices 504 and thus provide a single-point entry for the clients 202 to the microservices 504. In particular, the gateway 502 may act as a reverse proxy, routing the Web API calls 104 from the clients 202 to the microservices 504. As also shown, the gateway 502 may additionally aggregate and/or route API responses 506 from the microservices 504 to the respective clients 202.

Further, as illustrated, in some implementations, the gateway 502 may be configured to send policy information (e.g., the call policy information 106 introduced above in Section A) together with the API responses 506 that it sends to the respective clients 202, and the clients 202 may be configured to return that same call policy information 106 together with the next Web API calls 104 that they send to the gateway 502 for distribution to the microservices 504. In some implementations, for example, the call policy information 106 may be encoded within tokens (e.g., JWTs), certificates, data fragments, etc., that the gateway 502 appends to or otherwise includes with the API responses 506 it sends to respective clients 202, and the recipient clients 202 may extract those tokens, certificates, data fragments, etc., and append them to or otherwise include them in the next Web API calls 104 they send to the gateway 502. Including the call policy information 106 with the API responses 506 may be beneficial, as it may allow the gateway 502 to dynamically change the call policy information 106 after receipt of individual Web API calls 104 and adapt the call policy information 106 to current load, bandwidth, resource availability, etc.

As noted above, however, it should be appreciated that the call policy information 106 may take on any of a number of other forms and/or may be provided to clients 202 in any of a number of other ways without departing from the scope of the present disclosure.

FIG. 6 shows an example policy token 602 (e.g., a JWT) that may be used to convey policy information 106 from the gateway 502 to the clients 202, and vice versa, as well as examples of types of information that may be included within the call policy information 106 in some embodiments. As illustrated, the policy token 602 may include a header 604, a payload 606, and a signature 608. In some implementations, the call policy information 106 may be included in the header 604 and/or the payload 606 (e.g., as one or more claims). Further, in some implementations the header 604 may specify a signing technique that the gateway 502 uses to generate the signature 608 based on the content of the header 604 and the payload 606, as well as a private key. In some implementations, for example, the specified signing technique may involve (A) combining the base64url encoded header and the base64url encoded payload, (B) hashing the combined base64url value with a hashing technique, e.g., SHA256, and (C) encrypting the determined hash using a private key. As such, by validating the signature 608 using the private key and the specified signing technique, the gateway 502 may be able to confirm that the call policy information 106 included in the policy tokens 602 it receives from clients 202 has not been altered or otherwise compromised by the clients 202. In some implementations, the header 604 or payload 606 of the policy token 602 may additionally include an identifier (e.g., a client identifier (ID)) identifying the client 202 to which it is sent, thus enabling the gateway 502 to confirm that a received policy token 602 came from the same client 202 that such policy token 602 was originally sent to, thus restricting the transfer of the policy tokens 602 amongst clients 202.

The call policy information 106 that is encoded within the policy tokens 602 included with API responses 506, or that is communicated to the clients 202 in some other fashion, may take on any of numerous forms and may specify any of a number of policies to guide the clients 202 to collectively distribute the Web API calls 104 they send to the gateway 502 over time. As shown in FIG. 6, for example, in some implementations, the call policy information 106 may include one or more “next call” policies 610. As the name implies, such next call policies 610 may indicate the circumstances under which a client 202 may send its next Web API call 104 to the web service 102. The next call policies 610 may, for example, allow the recipient client 202 to make its next call (A) during one or more particular time intervals, (B) in response to the occurrence of particular events (e.g., logon, log off, etc.), and/or (C) during specified time intervals following the occurrence of particular events, e.g., during a two hour time period after the client 202 logs on. More specifically, as illustrated, in some implementations, the next call policies 610 may specify one or more “allowed slots” 610 a representing particular time intervals (e.g., one hour time slots during the following twenty-four hour window) during which the client 202 is permitted to send its next Web API call 104 to the web service 102 and/or one or more “blocked slots” 610 b representing time intervals (e.g., one hour time slots during the following twenty-four hour window) during which the client 202 is prohibited from sending its next Web API call 104 to the web service 102.

In some implementations, the allowed slots 610 a specified in the policy token 602 may additionally indicate (e.g., by being different types of allowed slots 610 a, such as “reserved slots” versus “preferred slots,” or by way of indicators or flags associated with the allowed slots 610 a) a degree of confidence that the web service 102 will have sufficient capacity to handle a Web API call 104 sent during the indicated time slot. In some implementations, for example, the individual allowed slots 610 a may represent or include either (1) an indication that the web service 102 is highly likely to have sufficient capacity to handle a Web API call 104 sent during the indicated time slot, such as by being designated as a “reserved slot,” or (2) an indication that the web service 102 is uncertain as to whether it will have capacity during the indicated time slot and will simply use its “best efforts” to process a Web API call 104 sent during the indicated time slot, such as by being designated as a “preferred” slot.

Further, as also illustrated, the next call policies 610 may additionally or alternatively include one or more “event triggers” 610 c identifying events (e.g., logging on, logging off, etc.) after or during which the recipient client 202 may send its next Web API call 104 to the web service 102, and/or indicating windows of time keyed off or otherwise triggered from such events during which the recipient client 202 may send its next Web API call 104 to the web service 102.
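As a non-limiting sketch, the next call policies 610 described above (allowed slots 610 a, blocked slots 610 b, and event triggers 610 c) might be represented as a simple data structure that can be serialized into token claims. All class, field, and function names below are hypothetical and are chosen only for illustration.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Slot:
    start: str                  # ISO 8601 start of the time slot
    end: str                    # ISO 8601 end of the time slot
    kind: Optional[str] = None  # "reserved" or "preferred" for allowed slots


@dataclass
class EventTrigger:
    event: str                  # e.g. "logon" or "logoff"
    window_seconds: int = 0     # window keyed off the event; 0 means none


@dataclass
class NextCallPolicy:
    allowed_slots: List[Slot] = field(default_factory=list)
    blocked_slots: List[Slot] = field(default_factory=list)
    event_triggers: List[EventTrigger] = field(default_factory=list)

    def to_claims(self) -> dict:
        """Flatten the policy into plain dictionaries and lists."""
        return asdict(self)


def hour_slot(day_start: datetime, hour: int, kind: Optional[str] = None) -> Slot:
    """Build a one-hour slot beginning `hour` hours after `day_start`."""
    start = day_start + timedelta(hours=hour)
    return Slot(start.isoformat(), (start + timedelta(hours=1)).isoformat(), kind)
```

For example, a policy might combine one reserved slot, one blocked slot, and a two hour window following logon, corresponding to options (A) through (C) noted above.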

With respect to the timing of the Web API calls 104 that are sent by clients 202, the inventors have recognized and appreciated that even when clients 202 need to issue Web API calls 104 with the same frequency (e.g., at least once every twenty-four hours), they are not generally required to maintain the same time interval (e.g., twenty-four hours) between successive Web API calls 104. Clients 202 usually have a grace period window and are permitted to space their Web API calls 104 anywhere within that grace period window without affecting business. For example, if a client 202 is required to send Web API calls 104 to the web service 102 at a frequency of one call every twenty-four hours, it may be acceptable for the client 202 to send first and second calls spaced by twenty-one hours followed by a third call after another twenty-seven hours. Since the web service 102 may have full knowledge of the traffic pattern from the clients 202, the gateway 502 may make intelligent spacing decisions and guide clients 202 to send Web API calls 104 in such a way that overloading is prevented and the server capacity is properly utilized. In some implementations, the clients 202 may additionally be configured to randomize the Web API calls 104 they send within allowed slots 610 a so as to achieve an even better distribution of load and help avoid large groups of clients 202 making Web API calls 104 at the beginnings of respective time slots.
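A minimal sketch of the client-side randomization noted above is shown below. It assumes a uniform distribution over the allowed slot, which is only one possible choice, and the function name `pick_call_time` is hypothetical.

```python
import random
from datetime import datetime, timedelta
from typing import Optional


def pick_call_time(
    slot_start: datetime,
    slot_end: datetime,
    rng: Optional[random.Random] = None,
) -> datetime:
    """Pick a uniformly random instant within an allowed slot."""
    if rng is None:
        rng = random.Random()
    span_seconds = (slot_end - slot_start).total_seconds()
    # rng.random() lies in [0.0, 1.0), so the offset stays inside the slot.
    return slot_start + timedelta(seconds=rng.random() * span_seconds)
```

Because each client draws its own offset, calls from clients assigned to the same slot tend to be spread across the slot rather than clustered at its start.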

As also shown in FIG. 6, in some implementations, the call policy information 106 may additionally or alternatively include one or more “retry” policies 612. Once again, as the name implies, the retry policies 612 may indicate the conditions under which a client 202 may re-send Web API calls 104 that were rejected by the gateway 502 (e.g., for failing to comply with the call policy information 106) and/or were unsuccessfully processed by one or more microservices 504. The appropriate use of retry policies 612 may significantly improve the performance of the system 100 and reduce instances in which clients 202 experience a denial of service (DoS) by the web service 102. Even a good client 202 may continue to send additional Web API calls 104 in a circumstance where the web service 102 is busy and cannot process an initial call. This way, a good client may turn into a bad client exactly when the service is under stress, contributing negatively to the on-going situation.

A good client may use a back-off technique or approach and the service can instruct the client 202 (e.g., via the retry policies 612 in the call policy information 106) about how to proceed with retries based on that technique. For example, as shown in FIG. 6, the retry policies 612 may specify a number of retries 612 a that are to be attempted by the back-off technique and/or whether the back-off technique is to perform a linear or exponential back-off 612 b, etc. Further, as also shown, in some implementations, the retry policies 612 may include one or more error codes 612 c indicating a reason the initial Web API call 104 was unsuccessful.
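The linear and exponential back-off options 612 b may be illustrated with the following sketch, which computes the delays a client 202 might wait before successive retries. The base delay, the cap on the delay, and the function name are assumptions made for illustration and are not specified above.

```python
from typing import List


def backoff_delays(
    retries: int,
    mode: str,
    base_seconds: float = 1.0,
    cap_seconds: float = 60.0,
) -> List[float]:
    """Return the wait before each retry for a linear or exponential back-off."""
    if mode not in ("linear", "exponential"):
        raise ValueError("mode must be 'linear' or 'exponential'")
    delays = []
    for attempt in range(1, retries + 1):
        if mode == "linear":
            delay = base_seconds * attempt             # 1x, 2x, 3x, ...
        else:
            delay = base_seconds * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
        delays.append(min(delay, cap_seconds))         # never exceed the cap
    return delays
```

In this sketch, the number of retries 612 a corresponds to the `retries` argument, and the client would stop retrying once the returned list of delays is exhausted.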

In some implementations, call policy information 106, including the retry policies 612, may also be included with error messages received from the web service 102 (e.g. an HTTP 500). Such an approach may help facilitate the dynamic updating of the next call policies 610 and/or retry policies 612 by the web service 102. In some implementations, for example, the gateway 502 may dynamically adjust either or both of those policies according to the service status/load/active connections/number of running threads, etc. The exception here may be when the web service 102 does not respond at all to a Web API call 104, such as when one or more components of the web service 102 are down or the Web API call 104 is dropped by the network. In some implementations, the clients 202 may be configured to follow the previously received retry policies 612 in such circumstances.

As explained in more detail below, using the call policy information 106 noted above, the clients 202 may be able to cooperate with the web service 102 by collectively distributing their Web API calls 104 over time, thus minimizing the risk that the web service 102 will be inundated by an unduly large number of Web API calls 104 at the same time. Further, in some implementations, additional steps may be taken to make sure the clients 202 actually follow the next call policies 610 and/or retry policies 612 that have been communicated to them. For example, in some implementations, client compliance may be regulated by encapsulating the call policy information 106 into security entities (e.g., cryptographically signed tokens, certificates, data fragments, etc.) that cannot be tampered with and requesting that the clients 202 return such security entities when making respective Web API calls 104. The signed policy token 602 described above in connection with FIG. 6 may be used as such a security entity. Benefits of using a signed policy token 602 such as that described above include, among other things, that such an entity: (A) cannot be altered by clients 202, (B) can be added in the request/response headers, so the actual request/response data models need not be changed, (C) does not require a lot of resources to be generated, (D) is easy to validate, (E) does not require shared storage in case of a multi-instance service as the signing keys can be shared during service deployment, (F) can be used separately from and in addition to authorization and authentication policies to reject bad actors in lower layers (e.g., firewalls, middleware, or other components configured to evaluate incoming messages), and (G) is visible to intermediate components like load balancers or API gateways so traffic/call management can be offloaded to those components.
It should be appreciated, however, that in some implementations the policy information 106 may additionally or alternatively be communicated to clients 202 using other formats, such as by way of data fragments (which may optionally be signed using a private key of the gateway 502) that are included within API responses 506 sent to clients 202, or are otherwise communicated to and/or made available for access by clients 202.

FIG. 7 shows an example routine 700 that may be employed by one or more components of the system 100, or by one or more components that operate in conjunction with the system 100, to implement certain aspects of the present disclosure. In some implementations, for example, one or more, or perhaps all, of the steps of the routine 700 may be implemented by the gateway 502 shown in FIG. 5. In other implementations, one or more of the steps of the routine 700 may additionally or alternatively be performed by one or more other system components, such as one or more load balancers (not illustrated) that manage traffic for the web service 102 and/or by one or more of the microservices 504. Further, it should be appreciated that, in some implementations, one or more, or perhaps all, of the steps of the routine 700 may additionally or alternatively be implemented in one or more lower layers of the web service 102, such as authentication and/or authorization, or perhaps even in one or more higher layers of the web service 102, such as one or more application handlers. Accordingly, the example implementation described below, in which the routine 700 is described as being performed by the gateway 502, should not be viewed as limiting with respect to the component(s) that may be used to implement the routine 700. That said, performing some or all of the steps of the routine 700 by one or more components other than the microservice(s) 504, e.g., the gateway 502 and/or one or more load balancers, may provide certain advantages in some implementations, as doing so may serve to reduce the complexity of the microservice(s) 504 and/or may prevent the microservice(s) 504 from being burdened with additional tasks.

FIG. 8 shows an example table 800 that may be updated and referenced by the gateway 502 (and/or by one or more other system components used to implement the routine 700), as described below, in connection with performing the routine 700. In some implementations, the gateway 502 (and/or one or more other components used to implement the routine 700) may include one or more processors and one or more computer-readable media encoded with instructions which, when executed by the one or more processors, cause the gateway 502 (and/or the one or more other components used to implement the routine 700) to perform the actions described herein. In such implementations, the table 800 may be embodied by one or more databases or other storage mediums accessible to the gateway 502 (and/or one or more other components used to implement the routine 700).

As shown in FIG. 7, the routine 700 may begin when, at a decision step 702, the gateway 502 determines that a new Web API call 104 has been received from a client 202, thus causing the routine 700 to proceed to a decision step 704.

At the decision step 704 of the routine 700, the gateway 502 may determine whether the received Web API call 104 includes call policy information 106. In some implementations, for example, the gateway 502 may determine whether the Web API call 104 includes a policy token 602 of the type described above in connection with FIG. 6.

When, at the decision step 704 of the routine 700, the gateway 502 determines that the received Web API call 104 includes call policy information 106, the routine 700 may proceed to a decision step 706, at which the gateway 502 may determine whether the call policy information 106 is authentic, as described below. When, on the other hand, the gateway 502 determines (at the decision step 704) that the Web API call 104 does not include call policy information 106, the routine 700 may instead proceed to a decision step 710, at which the gateway 502 may determine whether to treat the failure to include call policy information 106 in the received Web API call 104 with leniency, i.e., to overlook a reason for non-compliance. As noted below, in the event that the gateway 502 determines (at the decision step 710) to grant leniency to a non-compliant Web API call 104, the routine 700 may proceed to the step 714 at which the gateway 502 may forward the Web API call 104 to an appropriate service (e.g., one of the microservices 504) for processing. Examples of circumstances in which the gateway 502 may determine to grant leniency to a non-compliant Web API call 104 are set forth below. In a circumstance in which the gateway 502 determined (at the decision step 704) that the Web API call 104 lacked call policy information 106, the gateway 502 may determine (at the decision step 710) to grant leniency to the client 202 because, for example, the Web API call 104 may have been the initial Web API call 104 the client 202 has made to the web service 102 and the Web API call 104 may lack a policy token 602 simply because that client 202 has not yet been provided with a policy token 602 (e.g., together with a prior API response 506) to include in that call.

The gateway 502 may additionally or alternatively determine to grant leniency to a client device 202 for a non-compliant Web API call 104 (either for failure to include call policy information 106 in a Web API call 104 per the decision step 704, or for other reasons for non-compliance, e.g., as discussed below in connection with decision steps 706 and 708) in any of a number of other circumstances. For instance, in some implementations, the gateway 502 may allow respective clients 202 a certain number of non-compliant calls before determining to reject the Web API call 104 (per a step 712—as described below) rather than passing it along to an appropriate service, e.g., one of the microservices 504 shown in FIG. 5, for processing (per a step 714—as described below). The gateway 502 may thus determine (at the decision step 710) to grant leniency to a client 202 if that client has not yet exceeded its allotted number of non-compliant calls.

As another example, the gateway 502 may additionally or alternatively take into account the current expected load on the web service 102 when determining (at the decision step 710) whether to grant leniency with respect to a particular non-compliant Web API call 104. With reference to FIG. 8 (described in more detail below), for example, the gateway 502 may consult the table 800 to compare a value of a “scheduled count” entry 810 for the current time slot (e.g., as indicated by an arrow 802 in FIG. 8) with an estimated number of calls that are expected to occur within the current time slot (e.g., by referencing one or more historical “actual count” entries 808 for one or more comparable time slots, such as the same slot on the same day of the week for one or more prior weeks) to determine whether the capacity of the web service 102 is likely to be exceeded if the non-compliant call is allowed to be processed. If the gateway 502 determines that a particular non-compliant Web API call 104 is unlikely to have an adverse impact on clients 202 making compliant calls (e.g., because the value of the “scheduled count” entry 810 for the current time slot is more than a threshold amount below the estimated number of calls that are expected to occur within the current time slot), the gateway 502 may determine (at the decision step 710) to grant leniency to the Web API call 104 currently under consideration. Further, in some implementations, the gateway 502 may additionally or alternatively take into account a priority status assigned to the client 202 that issued the non-compliant call when determining (at the decision step 710) whether to grant leniency with respect to the call currently under consideration, such that higher-priority clients 202 are more likely to be granted leniency than lower priority clients. 
Any of a number of additional or different criteria for determining whether to grant leniency for a non-compliant Web API call 104 (per the decision step 710) are also possible and contemplated.
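One way the leniency criteria discussed above might be combined at the decision step 710 is sketched below. The particular combination is an assumption: the sketch requires that the client remain within its allotment of non-compliant calls and that the “scheduled count” for the current slot sit more than a threshold below an estimate derived from historical “actual count” values, and it halves that threshold for higher-priority clients. The function name, the use of a simple mean as the estimate, and the halving rule are all hypothetical.

```python
from statistics import mean
from typing import Sequence


def grant_leniency(
    noncompliant_so_far: int,
    allowance: int,
    scheduled_count: float,
    historical_actual_counts: Sequence[float],
    threshold: float,
    high_priority: bool = False,
) -> bool:
    """Decide whether a non-compliant call may still be forwarded."""
    if noncompliant_so_far >= allowance:
        return False  # client has used up its allotted non-compliant calls
    # Estimate the calls expected in this slot from comparable past slots.
    estimated_calls = mean(historical_actual_counts)
    # Assumption: higher-priority clients need only half the usual margin.
    required_margin = threshold / 2 if high_priority else threshold
    return estimated_calls - scheduled_count > required_margin
```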

As noted previously, at the decision step 706 of the routine 700, the gateway 502 may determine whether the call policy information 106 included with a received Web API call 104 is authentic. In some implementations, for example, the call policy information 106 may be encoded within the header 604 and/or payload 606 of a policy token 602 (e.g., a JWT) included with the received Web API call 104, and the gateway 502 may use its private key and the signing technique identified in the header 604 to process the content of the header 604 and payload 606 to determine whether the signature 608 is valid. In some implementations, for example, the gateway 502 may calculate a signature by (A) combining the Base64url encoded header and the Base64url encoded payload, (B) hashing the combined Base64url value with a hashing technique, e.g., SHA256, and (C) encrypting the determined hash using the private key. If the calculated signature matches the signature 608 of the token 602, the gateway 502 may determine the signature 608 is valid. In such implementations, the gateway 502 may thus determine that the call policy information 106 is authentic based on the presence of a valid signature 608 in the received policy token 602. As noted above, the existence of a valid signature 608 may give the gateway 502 confidence that the header 604 and payload 606 have not been altered since the gateway 502 originally generated and signed the policy token 602. Example techniques for generating new policy tokens 602 for distribution to clients are described below in connection with steps 720, 726, and 728 of the routine 700.

When, at the decision step 706 of the routine 700, the gateway 502 determines that the call policy information 106 is not authentic (because a signature 608 is absent or determined to be invalid, or for some other reason), the routine 700 may proceed to the decision step 710, at which the gateway 502 may determine whether to grant leniency for the non-compliant Web API call 104, for any of the reasons noted above or otherwise. When, on the other hand, the gateway 502 determines (at the decision step 706) that the call policy information 106 is authentic (by validating the signature 608 in a policy token 602 or otherwise), the routine 700 may instead proceed to the decision step 708, at which the gateway 502 may determine whether the Web API call 104 is compliant with the policy or policies indicated in the call policy information 106.

As noted above, the call policy information 106 can identify any of a number of criteria that a Web API call 104 must meet in order to be deemed “compliant.” With reference to FIG. 6, for example, in some implementations, a Web API call 104 may be deemed “non-compliant” if the gateway 502 determines that (A) the Web API call 104 was not issued within an allowed slot 610 a or in connection with an indicated event trigger 610 c, and/or (B) the Web API call 104 was issued within a blocked slot 610 b.
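The compliance test of the decision step 708 might be sketched as follows. The sketch assumes that slots and event-triggered windows are represented as (start, end) pairs with an exclusive end, and that a policy listing only blocked slots permits any call made outside those slots; neither assumption is mandated by the description above, and the function names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Tuple

Window = Tuple[datetime, datetime]  # (start, end), with end exclusive


def _in_any(call_time: datetime, windows: Iterable[Window]) -> bool:
    return any(start <= call_time < end for start, end in windows)


def is_compliant(
    call_time: datetime,
    allowed_slots: Iterable[Window] = (),
    blocked_slots: Iterable[Window] = (),
    trigger_windows: Iterable[Window] = (),
) -> bool:
    """Check a call's arrival time against the policy carried in its token."""
    if _in_any(call_time, blocked_slots):
        return False  # condition (B): issued within a blocked slot
    permitted = list(allowed_slots) + list(trigger_windows)
    if not permitted:
        return True   # assumption: only blocked slots were specified
    # condition (A): must fall in an allowed slot or an event-triggered window
    return _in_any(call_time, permitted)
```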

When, at the decision step 708 of the routine 700, the gateway 502 determines that the Web API call 104 does not comply with the call policy information 106, the routine 700 may proceed to the decision step 710, at which the gateway 502 may determine whether to grant leniency for the non-compliant Web API call 104, for any of the reasons noted above or otherwise. When, on the other hand, the gateway 502 determines (at the decision step 708) that the Web API call 104 does comply with the call policy information 106, the routine 700 may instead proceed to the step 714 at which the gateway 502 may forward the Web API call 104 to an appropriate service (e.g., one of the microservices 504) for processing.
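The flow through the decision steps 704, 706, 708, and 710 described above may be summarized by the following sketch, in which each check is supplied by the caller. The function name, the callable-based interface, and the reason strings are hypothetical and serve only to make the control flow explicit.

```python
from typing import Callable


def route_call(
    has_policy: bool,
    is_authentic: Callable[[], bool],
    is_compliant: Callable[[], bool],
    grant_leniency: Callable[[str], bool],
) -> str:
    """Return "forward" (step 714) or "reject" (step 712) for one call."""
    if not has_policy:           # decision step 704
        reason = "missing_policy"
    elif not is_authentic():     # decision step 706
        reason = "not_authentic"
    elif not is_compliant():     # decision step 708
        reason = "non_compliant"
    else:
        return "forward"         # compliant call goes straight to step 714
    # Decision step 710: every failure path converges on the leniency check.
    return "forward" if grant_leniency(reason) else "reject"
```

Passing the reason for non-compliance to the leniency check allows, for example, a first call that lacks a policy token 602 to be treated differently from a call bearing an invalid signature 608.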

At a decision step 716 of the routine 700, the gateway 502 may determine whether the Web API call 104 was successfully processed by the service to which it was forwarded (per the step 714). With reference to FIG. 5, such a determination may be made, for example, based on the API response 506 (which may include an error message) that the gateway 502 receives from the microservice 504 to which it sent the Web API call 104.

When, at the decision step 716 of the routine 700, the gateway 502 determines that the Web API call 104 was successfully handled by the service to which it was forwarded, the routine 700 may proceed to a step 718, at which the gateway 502 may increment the value of the “actual count” entry 808 in the table 800 (shown in FIG. 8) for the current time slot (as indicated by the arrow 802 in FIG. 8) by “1”. As shown in FIG. 8, in some implementations, individual time slots in the table 800 may be indicated by a “slot start date & time” entry 804 and a “slot end date & time” entry 806, which identify the date/time at which the indicated time slot begins and the date/time at which the indicated time slot ends, respectively. The table 800 may maintain historical records of the “actual count” entries 808 for any desired period of time in the past.
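As one illustrative possibility, the table 800 and the update performed at the step 718 might be modeled in memory as shown below. The in-memory list, the fixed one hour slot length, and all names are assumptions; as noted above, the table 800 may instead be embodied by one or more databases or other storage mediums.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SlotRow:
    start: datetime               # "slot start date & time" entry 804
    end: datetime                 # "slot end date & time" entry 806
    actual_count: int = 0         # "actual count" entry 808
    scheduled_count: float = 0.0  # "scheduled count" entry 810


def build_table(first_start: datetime, slots: int,
                slot_length: timedelta = timedelta(hours=1)) -> List[SlotRow]:
    """Create consecutive, non-overlapping rows starting at first_start."""
    return [
        SlotRow(first_start + i * slot_length, first_start + (i + 1) * slot_length)
        for i in range(slots)
    ]


def current_row(table: List[SlotRow], now: datetime) -> Optional[SlotRow]:
    """Find the row whose slot contains `now`, if any."""
    for row in table:
        if row.start <= now < row.end:
            return row
    return None


def record_success(table: List[SlotRow], now: datetime) -> bool:
    """Step 718: add one to the current slot's actual count."""
    row = current_row(table, now)
    if row is None:
        return False
    row.actual_count += 1
    return True
```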

In some implementations, the values of the “actual count” entries 808 may allow the gateway 502 to determine how to best scale the web service 102 to meet expected demands during corresponding time slots in the future. For example, in some implementations, the web service 102 may use the values of the historical “actual count” entries 808 to proactively determine how to provision the capacity of the web service 102 for future time periods, such as by increasing the capacity during business hours where the “actual count” entries 808 indicate a tendency for higher loads and decreasing the capacity during non-business hours where the “actual count” entries 808 indicate a tendency for lower loads. Further, in some implementations, the values of the “actual count” entries 808 may additionally or alternatively allow the gateway 502 to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106, as explained in more detail below.

At a step 720 of the routine 700, the gateway 502 may generate new call policy information 106 for the next Web API call 104 the client 202 is to make to the web service 102. In some implementations, generating such policy information may involve constructing a new policy token 602 to append to the API response 506 that is to be sent to the client 202, as illustrated in FIG. 6. In some implementations, the new call policy information 106 may be determined based on the data values in the table 800 (shown in FIG. 8), as explained in more detail below.

At a step 722 of the routine 700, the gateway 502 may update the values of the “scheduled count” entries 810 in the table 800 based on the new call policy information 106 determined at the step 720 (or at steps 726 or 728, as described below). In some implementations, such updates to the “scheduled count” entries 810 may be made to reflect a prediction as to when the client 202 is likely to send its next Web API call 104 based on the call policy information 106 that was determined at the step 720. In some circumstances, the gateway 502 may determine to increment the value of a “scheduled count” entry 810 for only a single time slot in the future. For instance, if the new call policy information 106 determined at the step 720 includes only a single allowed slot 610 a, then the gateway 502 may decide to increment the value of the “scheduled count” entry 810 for just that one allowed slot 610 a by “1”. In other circumstances, the gateway 502 may instead determine to increment the values of the “scheduled count” entries 810 for multiple future time slots based on the new call policy information 106 determined at the step 720. For example, if the new call policy information 106 determined at the step 720 includes two allowed slots 610 a, then the gateway 502 may determine to update the values of the “scheduled count” entries 810 for each of those two time slots by “0.5”. As another example, if the new call policy information 106 determined at the step 720 identifies three of twenty-four possible time slots for the client device 202 to send its next Web API call 104 as blocked slots 610 b, the gateway 502 may determine to update the values of the “scheduled count” entries 810 for the remaining twenty-one possible time slots by “1/21”. Further, as discussed in more detail below, in some implementations, the gateway 502 may determine (at the step 720) to assign different weights to different allowed slots 610 a that it includes in new call policy information 106. 
In such circumstances, at the step 722, the gateway 502 may apply corresponding weights to the values by which the “scheduled count” entries 810 are updated, so as to accurately reflect the likelihood that the recipient client 202 is going to select particular time slots for sending its next Web API call 104.
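
As a non-limiting illustration, the fractional updates described above for the step 722 may be sketched as follows. The function name `update_scheduled_counts`, its parameters, and the use of a plain list indexed by slot number to stand in for the “scheduled count” entries 810 are assumptions made only for this sketch and are not taken from the figures.

```python
def update_scheduled_counts(scheduled, allowed_slots=None, blocked_slots=None,
                            weights=None):
    """Spread one expected call over the slots a client may use.

    `scheduled` is a hypothetical list of per-slot counts (one float per slot).
    If allowed slots are given, the call is spread over them; otherwise it is
    spread over every slot that is not blocked.
    """
    if allowed_slots:
        candidates = list(allowed_slots)
    else:
        blocked = set(blocked_slots or ())
        candidates = [s for s in range(len(scheduled)) if s not in blocked]
    if not candidates:
        return scheduled  # nothing to update
    if weights is None:
        weights = [1.0] * len(candidates)  # equal likelihood per candidate slot
    total = sum(weights)
    for slot, weight in zip(candidates, weights):
        # Each slot receives its share of the single expected call.
        scheduled[slot] += weight / total
    return scheduled
```

In this sketch, a single allowed slot is incremented by 1, two equally weighted allowed slots by 0.5 each, and twenty-one unblocked slots by 1/21 each, so that the increments for one issued policy always sum to one expected call.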

In any event, referring back to the step 720, in some implementations, the new call policy information 106 the gateway 502 determines for the client 202 at that step may depend on the current values of the “scheduled count” entries 810 at the time the gateway 502 makes such a determination. As noted previously, the current values of the “scheduled count” entries 810 in the table 800 may reflect estimates of the number of Web API calls 104 the web service 102 will receive from clients 202 during the respective time slots that follow the current time slot (e.g., as indicated by the arrow 802). As the gateway 502 issues new call policy information 106 directing, or at least urging, respective clients 202 to issue their next Web API calls 104 during particular future time slots, the gateway 502 may adjust those estimates accordingly (as discussed above in connection with the step 722). As such, the gateway 502 may, at the step 720, generate new call policy information 106 that directs, or at least urges, the recipient client 202 to utilize one or more future time slots that are estimated to be less busy than the others.

In some implementations, when determining the new call policy information 106 for a client at the step 720, the gateway 502 may further take into account (A) any grace period that client 202 is allotted for satisfying a periodic Web API call requirement, if such a requirement exists, and/or (B) the values of historical “actual count” entries 808, as those values may enable the gateway 502 to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106. For example, in some implementations, when determining the call policy data 106 in a given instance, the gateway 502 may refrain from determining call policy data 106 that would raise the “scheduled count” entry 810 for a particular time slot beyond the expected scaled capacity of the web service 102 during that time slot. In some implementations, the gateway 502 may further determine the call policy data 106 so as to ensure that a “buffer value” remains between the value of a “scheduled count” entry 810 for a particular time slot and the expected scaled capacity of the web service 102 during that time slot. Leaving such a buffer value may, for example, provide the gateway 502 with sufficient leeway to provide leniency for at least certain types of non-compliant Web API calls 104 (e.g., per the decision step 710), such as for Web API calls 104 that are determined (e.g., per the decision step 704) not to include call policy information 106 because the clients 202 are making Web API calls 104 for the first time, or otherwise.

Any of a number of techniques and/or criteria may be utilized by the gateway 502 to optimize the call policy information 106 (determined at the step 720) depending on the current values of the “scheduled count” entries 810, and possibly other information (as noted above) that is accessible to the gateway 502. In some implementations, for example, the gateway 502 may identify a number of allowed slots 610 a corresponding to future time slots that are estimated to be less busy than others, and may additionally apply weights to those allowed slots 610 a in accordance with the relative values of the “scheduled count” entries 810 for those time slots. By employing such an approach, the gateway 502 may “nudge” individual clients into selecting the least busy time slot in a group but also allow those clients 202 leeway to select an alternate time slot that, although not ideal, would be suitable for the web service 102.
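
One possible way of selecting and weighting allowed slots 610 a is sketched below for illustration only. The sketch assumes that the expected scaled capacity of each slot is available as a list, that a fixed buffer value is held back per slot, and that weights are made proportional to the remaining headroom; the function name `choose_allowed_slots` and all of its parameters are hypothetical, and other weighting criteria may equally be used.

```python
def choose_allowed_slots(scheduled, capacity, candidate_slots,
                         buffer=0.0, max_slots=3):
    """Return [(slot, weight), ...] for the least busy candidate slots.

    Headroom is capacity minus buffer minus the scheduled count. Slots without
    room for at least one more call are skipped, and the weights of the
    returned slots sum to 1.
    """
    headroom = {s: capacity[s] - buffer - scheduled[s] for s in candidate_slots}
    # Keep only slots with room for one more call, least busy first.
    open_slots = sorted(
        (s for s in headroom if headroom[s] >= 1),
        key=lambda s: (-headroom[s], s),
    )[:max_slots]
    total = sum(headroom[s] for s in open_slots)
    return [(s, headroom[s] / total) for s in open_slots]
```

With weights proportional to headroom, the least busy slot receives the largest weight while the remaining slots stay available to the client as alternates. An empty result indicates that no candidate slot has room once the buffer value is accounted for.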

In some implementations, the new call policy information 106 the gateway 502 generates at the step 720 may additionally include one or more retry policies 612, so that the recipient client 202 knows how best to handle retry attempts, as discussed above, in the event the next Web API call 104 it attempts is unsuccessful.

At a step 724 of the routine 700, the gateway 502 may send the new call policy information 106 (determined at the step 720, or at the step 726 or 728, as discussed below) to the client device 202 from which the Web API call 104 was received at the decision step 702. For example, in implementations in which the call policy information 106 is encoded in a policy token 602, the generated policy token 602 may be appended to the API response 506 that the gateway 502 sends back to the client 202.

Referring again to the decision step 716 of the routine 700, when the gateway 502 determines (at the decision step 716) that the Web API call 104 was not successfully handled by the service to which it was forwarded, the routine 700 may proceed to a step 726, at which the gateway 502 may generate new call policy information 106 so that the recipient client 202 can be provided with one or more new next call policies 610 and/or one or more new retry policies 612 to mandate, or at least urge, the client 202 to make retry attempts during one or more particular time slots and/or employ a retry technique in a particular fashion, such as by specifying a maximum number of retry attempts, specifying whether to use a linear or exponential back-off technique, etc. The processes and information used to generate the new call policy information 106 at the step 726 may be essentially the same as those described above in connection with the step 720. The primary difference between the steps 726 and 720 is that the new call policy information 106 determined at the step 726 may be specifically configured to guide the client 202 to make a successful retry attempt after the web service 102 was unable to successfully process a Web API call 104, while at the same time maintaining a balanced load for the web service 102.

In the event that the gateway 502 rejects a Web API call 104 (at the step 712) after the gateway 502 has determined not to grant leniency to the client 202 (per the decision step 710) for a non-compliant Web API call 104, the routine 700 may proceed to a step 728, at which the gateway 502 may generate new call policy information 106 so that the recipient client 202 can be provided with one or more new next call policies 610 and/or one or more new retry policies 612 to mandate, or at least urge, the client 202 to make retry attempts during one or more particular time slots and/or employ a retry technique in a particular fashion, such as by specifying a maximum number of retry attempts, specifying whether to use a linear or exponential back-off technique, etc. The processes and information used to generate the new call policy information 106 at the step 728 may be essentially the same as those described above in connection with the steps 720 and 726. The primary difference between the step 728 and the steps 720 and 726 is that the new call policy information 106 determined at the step 728 may be specifically configured to guide the client 202 to make a successful retry attempt after a Web API call has been rejected by the gateway 502 for non-compliance with one or more policies while at the same time maintaining a balanced load for the web service 102.
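
For illustration, a client 202 might translate a retry policy 612 of the kind described in connection with the steps 726 and 728 into a series of wait times, as sketched below. The `RetryPolicy` fields (`max_attempts`, `backoff`, `base_delay_s`) are hypothetical names chosen for this sketch; the disclosure does not prescribe a particular encoding for a retry policy 612.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int    # maximum number of retry attempts
    backoff: str         # "linear" or "exponential"
    base_delay_s: float  # delay before the first retry, in seconds


def retry_delays(policy):
    """Return the wait time before each retry attempt, in seconds."""
    delays = []
    for attempt in range(1, policy.max_attempts + 1):
        if policy.backoff == "linear":
            # Delay grows by a constant step: base, 2*base, 3*base, ...
            delays.append(policy.base_delay_s * attempt)
        elif policy.backoff == "exponential":
            # Delay doubles each time: base, 2*base, 4*base, ...
            delays.append(policy.base_delay_s * 2 ** (attempt - 1))
        else:
            raise ValueError(f"unknown back-off technique: {policy.backoff!r}")
    return delays
```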

In some implementations, the implementation of the steps 722 and 724 following either of the steps 726 and 728 may be the same as or similar to the implementation of those steps following the step 720. The new call policy information 106 processed by the steps 722 and 724 may, however, be different when that new call policy information 106 is generated by the respective steps 720, 726, 728. Further, as indicated in the text of the step 724, in some circumstances, e.g., when a call is rejected by the gateway 502 per the step 712, the new call policy information 106 may be included as part of an error message, rather than as part of an API response 506, in some implementations.

An illustrative example of a scenario in which an intelligent Web API call management system 100, such as described herein, may be employed will now be described. Consider a server which wants to uniformly distribute the requests (e.g., Web API calls 104) it receives from clients over a twenty-four hour time period. Depending on the provisioned capacity, the server may determine the optimal number of requests it can serve in a one hour window. On the server side, a first data structure (e.g., “scheduled count” entries 810 of the table 800) may keep the call schedule for a future day. The schedule may be divided into twenty-four slots (keyed by the hour) and the value for each such slot may be a numerical count initialized to zero. The server may also maintain a second, similar data structure (e.g., the “actual count” entries 808 of the table 800) to track the requests that are actually being processed as the day progresses. For every request successfully processed, the count for the corresponding slot in the second data structure may be incremented. For a dropped request, the count for the corresponding slot in the second data structure is not incremented. This second data structure may serve the purpose of knowing the historical pattern of requests and identifying patterns (peak periods and lean periods).
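
The two data structures described above may, for example, be represented as a pair of twenty-four-element lists keyed by the hour. The class name `CallSchedule` and its methods are hypothetical and are offered only as a minimal sketch of such a representation.

```python
class CallSchedule:
    """Hypothetical server-side schedule keyed by hour of day (0-23)."""

    def __init__(self, slots_per_day=24):
        # First data structure: calls the server has scheduled per future slot.
        self.scheduled = [0.0] * slots_per_day
        # Second data structure: calls actually processed per slot.
        self.actual = [0] * slots_per_day

    def record(self, hour, processed):
        """Count a request only if it was successfully processed."""
        if processed:
            self.actual[hour] += 1

    def peak_and_lean_hours(self):
        """Return the busiest and quietest hours seen so far.

        Ties are resolved in favor of the earliest hour.
        """
        hours = range(len(self.actual))
        peak = max(hours, key=self.actual.__getitem__)
        lean = min(hours, key=self.actual.__getitem__)
        return peak, lean
```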

For every request successfully processed, the server may include a token (e.g., a policy token 602) in the header or payload of the response, e.g., an API response 506. The token may identify an allowed slot (e.g., an allowed slot 610 a indicated in a policy token 602) for the future date during which the next request can be issued from the client and/or one or more blocked slots (e.g., one or more blocked slots 610 b indicated in a policy token 602) which are to be avoided. These slots may be calculated based on the first data structure (representing the schedule for the future day) maintained by the server, the grace window allowed for the client, and the second data structure (representing the historical traffic pattern of requests seen by the server). Upon issuing the token, the server may modify the first data structure (representing the schedule for the future day) to increment the count for the slot(s) assigned to the client in the token. The token may indicate the slots assigned by the server and may be signed to avoid tampering. The token may also include a client identifier for making it non-transferable to other client devices 202. The protocol that the client device 202 is expected to follow to comply with the time slots designated by the server may be indicated in the token. While making the next call, the client may follow the schedule and may include the token in the call to the server. The server may evaluate the content of the received token and determine whether the client has violated the guidance or complied with it. The server may honor the calls which are made within the allowed slots and/or drop the ones which are sent during blocked slots. The server may also choose to be lenient for a few requests and become stricter if the non-compliance pattern continues.
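
A minimal sketch of issuing and evaluating such a signed token is shown below. For simplicity, the sketch substitutes an HMAC-SHA256 tag computed with a server-held secret for the private-key signature described herein; an asymmetric signature scheme could be used in the same places. The key value, the payload field names (`client_id`, `allowed`, `blocked`), the token layout, and the function names are all assumptions made for this sketch.

```python
import base64
import hashlib
import hmac
import json

SERVER_KEY = b"demo-key-not-for-production"  # hypothetical server-held secret


def issue_token(client_id, allowed_slots, blocked_slots=()):
    """Encode the slot assignment and client identifier, then sign them."""
    payload = json.dumps(
        {"client_id": client_id,
         "allowed": sorted(allowed_slots),
         "blocked": sorted(blocked_slots)},
        sort_keys=True, separators=(",", ":"),
    ).encode()
    tag = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(tag).decode())


def check_call(token, client_id, current_slot):
    """Return "invalid", "non-compliant", or "compliant" for an incoming call."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return "invalid"  # malformed token
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return "invalid"  # content was altered after signing
    data = json.loads(payload)
    if data["client_id"] != client_id:
        return "invalid"  # token was issued to a different client
    if current_slot in data["blocked"]:
        return "non-compliant"
    if data["allowed"] and current_slot not in data["allowed"]:
        return "non-compliant"
    return "compliant"
```

In this sketch, a "non-compliant" result does not by itself require the call to be dropped; the server may still apply leniency as described above before deciding whether to reject the call.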

For requests the server can't process due to current load, the server may consult the near-term schedule (e.g., as represented by the first data structure) within the grace window of the caller and provide a retry token with an allowed slot which has capacity, and/or with one or more blocked slots marked so that the client can do a smart retry and thus increase its chances of success.
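
By way of example, the selection of an allowed slot for a retry token may be sketched as a search over the slots that fall within the caller's grace window. The function name `pick_retry_slot`, the single per-slot `capacity` value, and the wrap-around from the last slot of the day to the first are assumptions made for this sketch.

```python
def pick_retry_slot(scheduled, capacity, current_slot, grace_slots):
    """Return the slot within the grace window that has the most spare room.

    Returns None when no slot in the window can take another call. The
    earliest slot wins when two slots have equal room.
    """
    best_slot, best_room = None, 0
    for offset in range(1, grace_slots + 1):
        slot = (current_slot + offset) % len(scheduled)  # assumed wrap-around
        room = capacity - scheduled[slot]
        if room >= 1 and room > best_room:
            best_slot, best_room = slot, room
    return best_slot
```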

With such an approach, clients may be guided by the server to generate a traffic pattern which is uniformly distributed, avoids spikes/denial of service and at the same time makes best use of server resources without affecting the business capability. Further, in some such implementations, the server need only know the count of the calls it has scheduled in a future slot (e.g., as represented by the first data structure) and not the exact clients scheduled therein.

While the foregoing example references one hour slots, any other suitable time interval may instead be employed. The same method may work for finer time slots (minutes, seconds etc.) as well as coarser ones, such as days. Similarly, the server may identify multiple allowed slots with different weightages for the client to choose from to give more flexibility to the client but at the same time centrally shape the traffic pattern.
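
As one illustration of how a client might choose among multiple weighted allowed slots, the sketch below draws a slot at random in proportion to the weights supplied by the server. The function name `pick_slot` and the representation of the weighted slots as `(slot, weight)` pairs are assumptions; a client could equally apply a deterministic rule, such as always selecting the highest-weighted slot.

```python
import random


def pick_slot(weighted_slots, rng=random):
    """Choose one slot, with probability proportional to its weight.

    `weighted_slots` is a hypothetical list of (slot, weight) pairs taken from
    the call policy information. `rng` may be a seeded random.Random instance
    for reproducible behavior.
    """
    slots = [slot for slot, _ in weighted_slots]
    weights = [weight for _, weight in weighted_slots]
    return rng.choices(slots, weights=weights, k=1)[0]
```

Because each client makes its own draw, a population of clients given the same weights would be expected to spread across the allowed slots roughly in proportion to those weights, which is consistent with the weighted “scheduled count” updates described above.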

F. Example Implementations of Methods, Systems, and Computer-Readable Media in Accordance with the Present Disclosure

The following paragraphs (M1) through (M29) describe examples of methods that may be implemented in accordance with the present disclosure.

(M1) A method may be performed that involves sending, from a computing system to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot, the first token including a first signature based on the first data and a private key of the computing system; receiving, by the computing system and from the client device, a first API call during the first time slot; determining, by the computing system, that the first API call includes a second token, the second token including second data and a second signature; determining, by the computing system and using the private key and the second data, that the second signature is valid; determining, by the computing system, that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot; and processing, by the computing system, the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.

(M2) A method may be performed as described in paragraph (M1), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first token in a response to the second API call; and sending, by the computing system, the response to the client device.

(M3) A method may be performed as described in paragraph (M1) or paragraph (M2), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M4) A method may be performed that involves sending, from a computing system to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during the first time slot; and processing, by the computing system, the first API call.

(M5) A method may be performed as described in paragraph (M4), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.

(M6) A method may be performed as described in paragraph (M4) or paragraph (M5), and may further involve determining, by the computing system, that the first API call was received from the client device during the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received during the first time slot.

(M7) A method may be performed as described in paragraph (M6), and may further involve determining, by the computing system, that the first API call includes second data indicative of the client device having been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call may be further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the computing system during the first time slot.

(M8) A method may be performed as described in paragraph (M7), and may further involve determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call may be further based at least in part on the second data matching the first data.

(M9) A method may be performed as described in any of paragraphs (M4) through (M8), and may further involve generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system may send the first data to the client device as a part of the token.

(M10) A method may be performed as described in any of paragraphs (M4) through (M9), and may further involve determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call may be further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the computing system during the first time slot.

(M11) A method may be performed as described in paragraph (M10), and may further involve determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call may be further based at least in part on the signature of the token being valid.

(M12) A method may be performed as described in any of paragraphs (M4) through (M11), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M13) A method may be performed as described in paragraph (M12), and may further involve updating the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(M14) A method may be performed as described in paragraph (M12) or paragraph (M13), wherein the stored data may be further indicative of a second estimated capacity of the computing system to process API calls during a second time slot, and the method may further involve determining, by the computing system, that the first estimated capacity is greater than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being greater than the second estimated capacity, the first data to indicate that the client device is authorized to send an API call to the computing system during the first time slot rather than the second time slot.

(M15) A method may be performed as described in any of paragraphs (M4) through (M14), and may further involve determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make web API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.

(M16) A method may be performed as described in any of paragraphs (M4) through (M14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(M17) A method may be performed that involves sending, from a computing system to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during a second time slot that is different than the first time slot; and processing, by the computing system, the first API call.

(M18) A method may be performed as described in paragraph (M17), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.

(M19) A method may be performed as described in paragraph (M17) or paragraph (M18), and may further involve determining, by the computing system, that the first API call was received from the client device outside of the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(M20) A method may be performed as described in paragraph (M19), and may further involve determining, by the computing system, that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the computing system during the first time slot but not having been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call may be further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the computing system during the second time slot.

(M21) A method may be performed as described in paragraph (M20), and may further involve determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call may be further based at least in part on the second data matching the first data.

(M22) A method may be performed as described in any of paragraphs (M17) through (M21), and may further involve generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system may send the first data to the client device as a part of the token.

(M23) A method may be performed as described in any of paragraphs (M17) through (M22), and may further involve determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been prohibited from sending an API call to the computing system during the first time slot but has not been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call may be further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the computing system during the second time slot.

(M24) A method may be performed as described in paragraph (M23), and may further involve determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call may be further based at least in part on the signature of the token being valid.

(M25) A method may be performed as described in any of paragraphs (M17) through (M24), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M26) A method may be performed as described in paragraph (M25), and may further involve updating the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(M27) A method may be performed as described in paragraph (M25) or paragraph (M26), wherein the stored data may be further indicative of a second estimated capacity of the computing system to process API calls during the second time slot, and the method may further involve determining, by the computing system, that the first estimated capacity is lower than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being lower than the second estimated capacity, the first data to indicate that the client device is prohibited from sending an API call to the computing system during the first time slot but is not prohibited from sending an API call to the computing system during the second time slot.

(M28) A method may be performed as described in any of paragraphs (M17) through (M27), and may further involve determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make web API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.

(M29) A method may be performed as described in any of paragraphs (M17) through (M28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

The following paragraphs (S1) through (S29) describe examples of systems and devices that may be implemented in accordance with the present disclosure.

(S1) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, the first token including a first signature based on the first data and a private key of the system, to receive, from the client device, a first API call during the first time slot, to determine that the first API call includes a second token, the second token including second data and a second signature, to determine, using the private key and the second data, that the second signature is valid, to determine that the second data indicates that the client device was authorized to send the first API call to the system during the first time slot, and to process the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the system during the first time slot.

(S2) A system may be configured as described in paragraph (S1), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first token in a response to the second API call, and to send the response to the client device.

(S3) A system may be configured as described in paragraph (S1) or paragraph (S2), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S4) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, to receive, from the client device, a first API call during the first time slot, and to process the first API call.

(S5) A system may be configured as described in paragraph (S4), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(S6) A system may be configured as described in paragraph (S4) or paragraph (S5), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device during the first time slot, and to determine to process the first API call based at least in part on the first API call having been received during the first time slot.

(S7) A system may be configured as described in paragraph (S6), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the system during the first time slot.

(S8) A system may be configured as described in paragraph (S7), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(S9) A system may be configured as described in any of paragraphs (S4) through (S8), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(S10) A system may be configured as described in any of paragraphs (S4) through (S9), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the system during the first time slot.

(S11) A system may be configured as described in paragraph (S10), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(S12) A system may be configured as described in any of paragraphs (S4) through (S11), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S13) A system may be configured as described in paragraph (S12), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(S14) A system may be configured as described in paragraph (S12) or paragraph (S13), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during a second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is greater than the second estimated capacity, and to configure the first data to indicate that the client device is authorized to send an API call to the system during the first time slot rather than the second time slot based at least on the first estimated capacity being greater than the second estimated capacity.

(S15) A system may be configured as described in any of paragraphs (S4) through (S14), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(S16) A system may be configured as described in any of paragraphs (S4) through (S14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(S17) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that the client device is prohibited from sending an API call to the system during at least a first time slot, to receive, from the client device, a first API call during a second time slot that is different than the first time slot, and to process the first API call.

(S18) A system may be configured as described in paragraph (S17), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(S19) A system may be configured as described in paragraph (S17) or paragraph (S18), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device outside of the first time slot, and to determine to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(S20) A system may be configured as described in paragraph (S19), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the system during the first time slot but not having been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the system during the second time slot.

(S21) A system may be configured as described in paragraph (S20), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(S22) A system may be configured as described in any of paragraphs (S17) through (S21), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(S23) A system may be configured as described in any of paragraphs (S17) through (S22), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been prohibited from sending an API call to the system during the first time slot but has not been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the system during the second time slot.

(S24) A system may be configured as described in paragraph (S23), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(S25) A system may be configured as described in any of paragraphs (S17) through (S24), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S26) A system may be configured as described in paragraph (S25), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(S27) A system may be configured as described in paragraph (S25) or paragraph (S26), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during the second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is lower than the second estimated capacity, and to configure the first data to indicate that the client device is prohibited from sending an API call to the system during the first time slot but is not prohibited from sending an API call to the system during the second time slot based at least on the first estimated capacity being lower than the second estimated capacity.

(S28) A system may be configured as described in any of paragraphs (S17) through (S27), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(S29) A system may be configured as described in any of paragraphs (S17) through (S28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.
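
By way of illustration only, the capacity-based prohibition described in paragraphs (S25) through (S27), together with the receipt check of paragraph (S19), might be sketched as follows. The function names, the dictionary layout of the stored data, the "prohibited_slots" field, and the choice to reduce availability by exactly one call are all hypothetical assumptions made for this sketch; the disclosure does not prescribe them.

```python
from typing import Dict, List


def build_first_data(stored: Dict[int, int], first_slot: int, second_slot: int) -> Dict[str, List[int]]:
    """Generate hypothetical 'first data' from stored per-slot capacity estimates.

    `stored` maps a slot identifier to an estimated number of API calls the
    system can still service in that slot (an assumed representation).
    """
    if stored[first_slot] < stored[second_slot]:
        # The client is steered toward the second slot, so record a reduced
        # availability for that slot (assumed here to be one call).
        stored[second_slot] -= 1
        return {"prohibited_slots": [first_slot]}
    # Otherwise this sketch imposes no prohibition at all.
    return {"prohibited_slots": []}


def should_process(first_data: Dict[str, List[int]], arrival_slot: int) -> bool:
    """Process a call only if it arrived outside every prohibited slot."""
    return arrival_slot not in first_data["prohibited_slots"]


stored_data = {1: 2, 2: 10}  # slot 1 has the lower estimated capacity
first_data = build_first_data(stored_data, first_slot=1, second_slot=2)
print(first_data, stored_data)
print(should_process(first_data, arrival_slot=2))
```

In this sketch the stored data is mutated in place when the first data is generated, which mirrors the update described in paragraph (S26); a real deployment would likely need to guard that update against concurrent requests.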

The following paragraphs (CRM1) through (CRM29) describe examples of computer-readable media that may be implemented in accordance with the present disclosure.

(CRM1) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, the first token including a first signature based on the first data and a private key of the system, to receive, from the client device, a first API call during the first time slot, to determine that the first API call includes a second token, the second token including second data and a second signature, to determine, using the private key and the second data, that the second signature is valid, to determine that the second data indicates that the client device was authorized to send the first API call to the system during the first time slot, and to process the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the system during the first time slot.

(CRM2) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM1), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first token in a response to the second API call, and to send the response to the client device.

(CRM3) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM1) or paragraph (CRM2), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM4) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, to receive, from the client device, a first API call during the first time slot, and to process the first API call.

(CRM5) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM4), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(CRM6) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM4) or paragraph (CRM5), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device during the first time slot, and to determine to process the first API call based at least in part on the first API call having been received during the first time slot.

(CRM7) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM6), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the system during the first time slot.

(CRM8) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM7), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(CRM9) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM8), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(CRM10) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM9), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the system during the first time slot.

(CRM11) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM10), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(CRM12) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM11), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM13) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM12), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(CRM14) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM12) or paragraph (CRM13), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during a second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is greater than the second estimated capacity, and to configure the first data to indicate that the client device is authorized to send an API call to the system during the first time slot rather than the second time slot based at least on the first estimated capacity being greater than the second estimated capacity.

(CRM15) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM14), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(CRM16) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(CRM17) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, first data indicating that the client device is prohibited from sending an API call to the system during at least a first time slot, to receive, from the client device, a first API call during a second time slot that is different than the first time slot, and to process the first API call.

(CRM18) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM17), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(CRM19) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM17) or paragraph (CRM18), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device outside of the first time slot, and to determine to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(CRM20) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM19), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the system during the first time slot but not having been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the system during the second time slot.

(CRM21) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM20), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(CRM22) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM21), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(CRM23) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM22), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been prohibited from sending an API call to the system during the first time slot but has not been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the system during the second time slot.

(CRM24) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM23), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(CRM25) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM24), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM26) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM25), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(CRM27) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM25) or paragraph (CRM26), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during the second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is lower than the second estimated capacity, and to configure the first data to indicate that the client device is prohibited from sending an API call to the system during the first time slot but is not prohibited from sending an API call to the system during the second time slot based at least on the first estimated capacity being lower than the second estimated capacity.

(CRM28) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM27), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(CRM29) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.
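
By way of illustration only, the token handling described in paragraphs (CRM1) and (CRM9) through (CRM11) might be sketched as follows. This sketch assumes a keyed hash (HMAC-SHA256) computed with a key held only by the system, which is one possible reading of a signature that is both generated and validated using the same key; the token layout, the field names, the key value, and the function names are hypothetical and are not taken from the disclosure.

```python
import base64
import hashlib
import hmac
import json

SYSTEM_KEY = b"example-only-key"  # hypothetical key, known only to the system


def issue_token(client_id: str, slot_start: int, slot_end: int, key: bytes = SYSTEM_KEY) -> str:
    """Encode the authorized time slot as 'first data' and append a signature."""
    data = json.dumps(
        {"client": client_id, "start": slot_start, "end": slot_end}, sort_keys=True
    ).encode()
    signature = hmac.new(key, data, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(data).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def authorize_call(token: str, received_at: int, key: bytes = SYSTEM_KEY) -> bool:
    """Validate the signature, then check the call arrived in the authorized slot."""
    try:
        data_b64, signature_b64 = token.split(".")
        data = base64.urlsafe_b64decode(data_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # signature does not match the token content
    slot = json.loads(data)
    return slot["start"] <= received_at < slot["end"]


token = issue_token("client-202", slot_start=100, slot_end=200)
print(authorize_call(token, received_at=150))
print(authorize_call(token, received_at=250))
```

Because the time slot travels inside the signed token, the system in this sketch can decide whether to process a call without looking up per-client state; an asymmetric signature scheme could be substituted without changing the overall flow.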

Having thus described several aspects of at least one embodiment, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the disclosure. Accordingly, the foregoing description and drawings are by way of example only.

Various aspects of the present disclosure may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in this application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Also, the disclosed aspects may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc. in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed. Such terms are used merely as labels to distinguish one claim element having a certain name from another claim element having the same name (but for use of the ordinal term).

Also, the phraseology and terminology used herein are used for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” “having,” “containing,” “involving,” and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

What is claimed is:
 1. A method, comprising: sending, from a computing system to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot, the first token including a first signature based on the first data and a private key of the computing system; receiving, by the computing system and from the client device, a first API call during the first time slot; determining, by the computing system, that the first API call includes a second token, the second token including second data and a second signature; determining, by the computing system and using the private key and the second data, that the second signature is valid; determining, by the computing system, that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot; and processing, by the computing system, the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.
 2. The method of claim 1, further comprising: receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first token in a response to the second API call; and sending, by the computing system, the response to the client device.
 3. The method of claim 1, further comprising: generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.
 4. A method, comprising: sending, from a computing system to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during the first time slot; and processing, by the computing system, the first API call.
 5. The method of claim 4, further comprising: receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.
 6. The method of claim 4, further comprising: determining, by the computing system, that the first API call was received from the client device during the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received during the first time slot.
 7. The method of claim 6, further comprising: determining, by the computing system, that the first API call includes second data indicative of the client device having been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call is further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the computing system during the first time slot.
 8. The method of claim 7, further comprising: determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call is further based at least in part on the second data matching the first data.
 9. The method of claim 4, further comprising: generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system sends the first data to the client device as a part of the token.
 10. The method of claim 4, further comprising: determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call is further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the computing system during the first time slot.
 11. The method of claim 10, further comprising: determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call is further based at least in part on the signature of the token being valid.
 12. The method of claim 4, further comprising: generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.
 13. The method of claim 12, further comprising: in response to generating the first data, updating the stored data to indicate a reduced availability of the first time slot to service API calls.
 14. The method of claim 12, wherein the stored data is further indicative of a second estimated capacity of the computing system to process API calls during a second time slot, and the method further comprises: determining, by the computing system, that the first estimated capacity is greater than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being greater than the second estimated capacity, the first data to indicate that the client device is authorized to send an API call to the computing system during the first time slot rather than the second time slot.
 15. The method of claim 4, further comprising: determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.
 16. The method of claim 4, wherein the first data further indicates how the client device is to go about sending an additional API call if an attempted API call fails.
 17. A method, comprising: sending, from a computing system to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during a second time slot that is different than the first time slot; and processing, by the computing system, the first API call.
 18. The method of claim 17, further comprising: receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.
 19. The method of claim 17, further comprising: determining, by the computing system, that the first API call was received from the client device outside of the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received outside of the first time slot.
 20. The method of claim 19, further comprising: determining, by the computing system, that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the computing system during the first time slot but not having been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call is further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the computing system during the second time slot.
 21. The method of claim 20, further comprising: determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call is further based at least in part on the second data matching the first data.
 22. The method of claim 17, further comprising: generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system sends the first data to the client device as a part of the token.
 23. The method of claim 17, further comprising: determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been prohibited from sending an API call to the computing system during the first time slot but has not been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call is further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the computing system during the second time slot.
 24. The method of claim 23, further comprising: determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call is further based at least in part on the signature of the token being valid.
 25. The method of claim 17, further comprising: generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.
 26. The method of claim 25, further comprising: in response to generating the first data, updating the stored data to indicate a reduced availability of the second time slot to service API calls.
 27. The method of claim 25, wherein the stored data is further indicative of a second estimated capacity of the computing system to process API calls during the second time slot, and the method further comprises: determining, by the computing system, that the first estimated capacity is lower than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being lower than the second estimated capacity, the first data to indicate that the client device is prohibited from sending an API call to the computing system during the first time slot but is not prohibited from sending an API call to the computing system during the second time slot.
 28. The method of claim 17, further comprising: determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.
 29. The method of claim 17, wherein the first data further indicates how the client device is to go about sending an additional API call if an attempted API call fails.
 30. A system, comprising: at least one processor; and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to: send, to a client device, first data indicating that (A) the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, or (B) the client device is prohibited from sending an API call to the system during at least a second time slot that is different than the first time slot, receive, from the client device, a first API call during the first time slot, and process, by the system, the first API call. 
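
By way of non-limiting illustration only, the following sketch shows one way the capacity-based slot selection, token generation, and token validation recited above might be arranged. It is not the claimed implementation. The names SERVER_KEY, pick_slot, issue_token, and should_process, the JSON payload layout, and the integer slot identifiers are all hypothetical. The use of HMAC-SHA256 with a key held only by the computing system is likewise an assumption; the claims recite only "a private key" and do not name a signature scheme.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key held only by the computing system (assumption).
SERVER_KEY = b"demo-only-secret"


def pick_slot(capacity):
    """Return the time slot with the greatest estimated remaining capacity."""
    return max(capacity, key=capacity.get)


def issue_token(client_id, capacity):
    """Authorize a slot, reduce its stored availability, and sign the result."""
    slot = pick_slot(capacity)
    capacity[slot] -= 1  # stored data now reflects reduced availability
    payload = json.dumps({"client": client_id, "slot": slot}, sort_keys=True).encode()
    signature = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def should_process(token, current_slot):
    """Return True only if the signature is valid and the slot matches."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Malformed token: wrong number of parts or invalid base64.
        return False
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    return json.loads(payload)["slot"] == current_slot
```

In this sketch, a call arriving in a slot other than the authorized one, or carrying a token whose content was altered after signing, is simply not processed; how such a call would actually be handled (for example, rejected, delayed, or answered with a new schedule) is left open.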